Sentence Level Sentiment Analysis

R
Python
Sentiment Analysis
Reticulate
by calling Python in R
Author
Affiliation

Xinzhuo Huang

HKUST SOSC

Published

February 20, 2023

Modified

November 2, 2023

Text analysis enables the identification and extraction of sentiment information from text. By leveraging tools such as asent, sentimentr, and basic sentiment lexicons, it is possible to construct a sentiment classifier that can estimate whether the underlying sentiment of a given text is positive, negative, or neutral.

Pacakage Management

Using pacman to manage package dependencies.

Code
require(pacman)
p_load(dplyr, magrittr, furrr, purrr, DT, readr, stringr, furrr, purrr, sentimentr, tidyfst, tidyr, textclean, tidytext, tibble, kableExtra)

Load Data

Our demonstration data are sourced from a news dataset collected from Factiva, consisting of 500 observations. The “link” refers to the links of these news articles. These news articles have been translated into English and undergone thorough text cleaning. Please find the data here.

Code
data <- read_rds("E:/OneDrive - HKUST Connect/SOSC/paper_with_jean/group_meeting/fifth/data_for_analysis_share.Rds")
glimpse(data)
Rows: 500
Columns: 2
$ link         <chr> "https://global.factiva.com/redir/default.aspx?P=sa&an=AF…
$ cleaned_text <chr> "the head of the women tennis association says that he is…

Using asent

To use asent, a Python package in R, you need reticulate package to integrate Python code in your R script. You can change your python interpreter here:

install python packages:

Call python in R.

Code
reticulate::use_python("C:\\Users\\xhuangcb\\anaconda3\\envs\\pytorch_gpu\\python.exe") # call python in R

Then:

Tpying To install
!pip install spacy spacy
!pip install asent asent
!pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.5.0/en_core_web_lg-3.5.0-py3-none-any.whl pre-trained model

from OOP to FP

Python is an object-oriented programming (OOP) language, and it is possible to convert Python classes into functions in R, which mainly support functional programming (FP).

Code
spacy <- reticulate::import("spacy") # load spacy

asent <- reticulate::import("asent") # load asent

nlp <- spacy$load("en_core_web_lg") # load pre-trained model

nlp$add_pipe("asent_en_v1") # add asent pipe
<asent.component.Asent object at 0x000001D8E8A8C940>
Code
nlp_safe <- possibly(.f = nlp, otherwise = "error!") # error handling

get_polarity <- nlp_safe("sentiment")$get_extension("polarity")[[3]] # load python function

get_sentiment_asent <- \(x) {
    result <- nlp_safe(x) %>%
        get_polarity() %>%
        print() %>%
        capture.output() %>%
        str_extract_all("-?0\\.\\d+") %>%
        unlist() %>%
        as.numeric()

    names(result) <- c("neg", "neu", "pos", "compound")

    return(result)
}

Try our function:

Code
get_sentiment_asent(data$cleaned_text[1])
     neg      neu      pos compound 
  0.0440   0.6350   0.0710   0.1235 

It is an analytical process of asent:

Using sentimentr

Our function using sentimentr.

Code
get_sentences_safe <- possibly(.f = get_sentences, otherwise = "error!") # sentences cutter

sentiment_by_safe <- possibly(.f = sentiment_by, otherwise = "error!") # sentiment classifier

get_sentiemnt_sentimentr <- \(x) {
    x %>%
        get_sentences_safe() %>%
        sentiment_by_safe() %>%
        unlist()
}

Try it:

Code
get_sentiemnt_sentimentr(data$cleaned_text[1])
   element_id    word_count            sd ave_sentiment 
    1.0000000   382.0000000     0.3115586     0.1770496 

Word-level

Code
cal_sentiment <- \(x) {
    x %>%
        filter(value == 0) %>%
        pull(value) %>%
        length() -> neu

    x %>%
        filter(value > 0) %>%
        pull(value) %>%
        sum() -> pos

    x %>%
        filter(value < 0) %>%
        pull(value) %>%
        sum() -> neg
    result <- c(neu, pos, neg)

    names(result) <- c("neu", "pos", "neg")

    return(result)
}

get_sentiment_word <- \(x) {
    result <- x %>%
        tibble(text = ., id = 1) %>%
        unnest_tokens(word, text) %>%
        filter(!word %in% stopwords::stopwords(source = "stopwords-iso")) %>%
        left_join(get_sentiments("afinn")) %>% # you can try different dict
        replace_na(list(value = 0)) %>%
        nest(data = !id) %>%
        mutate(sentiment = map(data, cal_sentiment)) %>%
        pull(sentiment) %>%
        unlist()

    return(result)
}

Try it:

Code
get_sentiment_word(data$cleaned_text[1])
neu pos neg 
138  18 -12 

Using these functions in loop

You can use these function with for loop or just use mapto apply a them to each element of a vector or list in tibble.

Code
data <- data %>%
    mutate(sentiment_asent = map(cleaned_text, get_sentiment_asent, .progress = TRUE))